DETAILED ACTION



Claim Rejections - 35 USC § 102 



1. The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that 
form the basis for the rejections under this section made in this Office action: 

A person shall be entitled to a patent unless - 

(e) the invention was described in (1) an application for patent, published under section 122(b), by 
another filed in the United States before the invention by the applicant for patent or (2) a patent 
granted on an application for patent by another filed in the United States before the invention by the 
applicant for patent, except that an international application filed under the treaty defined in section 
351(a) shall have the effects for purposes of this subsection of an application filed in the United States 
only if the international application designated the United States and was published under Article 21(2) 
of such treaty in the English language. 

2. Claims 1-6, 14-21, 25-30, 32 and 33 are rejected under 35 U.S.C. 102(e) as 
being anticipated by Ramaswamy et al. (U.S. Patent No. 6,188,976 filed Oct. 23, 1998). 

As per claims 1, 18, 19, 20, 27 and 28, Ramaswamy et al. discloses a method
comprising: 

developing a language model from a tuning set of information (C.2. lines 44-48); 

segmenting at least a subset of received textual corpus and calculating a 
perplexity value for each segment (C.4. lines 13-20-the external corpus is segmented 
into linguistic units and a perplexity value is calculated for each unit); 

refining the language model with one or more segments of the received corpus 
based, at least in part, on the calculated perplexity value for the one or more segments 
(C.3. lines 47-52, C.4. lines 45-47-the language model is updated based upon the 
perplexity value). 
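For illustration only, the per-segment perplexity calculation mapped above can be sketched as follows. This is a minimal sketch under stated assumptions, not the method of Ramaswamy et al. or of the application: the add-one-smoothed unigram model, the function names `train_unigram` and `perplexity`, and the sample sentences are all hypothetical.

```python
import math
from collections import Counter

def train_unigram(tokens):
    """Build an add-one-smoothed unigram model from a tuning set (assumed model type)."""
    counts = Counter(tokens)
    total = sum(counts.values())
    vocab_size = len(counts) + 1  # one extra slot reserved for unseen words

    def prob(word):
        return (counts.get(word, 0) + 1) / (total + vocab_size)

    return prob

def perplexity(prob, segment):
    """Perplexity is exp of the average negative log-probability per token."""
    if not segment:
        raise ValueError("segment must contain at least one token")
    log_sum = sum(math.log(prob(word)) for word in segment)
    return math.exp(-log_sum / len(segment))

# Hypothetical tuning set and received segments.
tuning_set = "show me the next e-mail show me the previous e-mail".split()
model = train_unigram(tuning_set)
segments = [
    "show me the next message".split(),
    "stock prices fell sharply today".split(),
]
scores = [perplexity(model, segment) for segment in segments]
```

Under these assumptions the first segment, which shares vocabulary with the tuning set, receives the lower perplexity of the two.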

As per claims 2 and 21, Ramaswamy et al. discloses all of the limitations of claim 1,
upon which claim 2 depends. Ramaswamy et al. further discloses:

the tuning set of information (C.5. lines 36-38-test corpus) is application specific
(C.5. lines 40-44-the application is speech recognition).

As per claim 3, Ramaswamy et al. discloses all of the limitations of claim 1, upon
which claim 3 depends. Ramaswamy et al. further discloses:

the tuning set of information is comprised of one or more application-specific
documents (C.7. lines 6-9-the application is e-mail, the documents comprise "show me
the next e-mail...").

As per claim 4, Ramaswamy et al. discloses all of the limitations of claim 1, upon
which claim 4 depends. Ramaswamy et al. further discloses: 

the tuning set of information is a highly accurate set of textual information 
linguistically relevant to (C.2. lines 55-58), but not taken from, the received textual 
corpus (C. 3. lines 14-18, the received corpus-external corpus comprises many domains, 
however the seed corpus is linguistically related, but not taken from the external 
corpus). 

As per claim 5, Ramaswamy et al. discloses all of the limitations of claim 1, upon
which claim 5 depends. Ramaswamy et al. further discloses:

a training set comprised of at least the subset of the received textual corpus
(C.3. lines 6-8, 14-17-test corpus is at least the subset of the received textual corpus).

As per claim 6, Ramaswamy et al. discloses all of the limitations of claim 5, upon 
which claim 6 depends. Ramaswamy et al. further discloses: 

ranking the segments of the training set based, at least in part, on the calculated
perplexity value for each segment (C.4. lines 36-41, C.8. lines 34-36).

As per claim 14, Ramaswamy et al. discloses all of the limitations of claim 1,
upon which claim 14 depends. Ramaswamy et al. further discloses:

the perplexity value is a measure of the predictive power of a certain language 
model to a segment of the received corpus (C.4. lines 16-21). 

As per claim 15, Ramaswamy et al. discloses all of the limitations of claim 1,
upon which claim 15 depends. Ramaswamy et al. further discloses: 

ranking the segments of at least the subset of the received corpus based, at least
in part, on the calculated perplexity value of each segment (C.4. lines 36-40, C.8. lines
34, 35); and

updating the tuning set of information with one or more of the segments from at 
least the subset of the received corpus (C.4. lines 41-47). 

As per claim 16, Ramaswamy et al. discloses all of the limitations of claim 15, 
upon which claim 16 depends. Ramaswamy et al. further discloses: 

one or more of the segments with the lowest perplexity value from at least the 
subset of the received corpus are added to the tuning set (C.4. lines 41-47- "...below the 
perplexity threshold ..."). 
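For illustration only, the ranking and threshold-based selection mapped in claims 15 and 16 can be sketched as follows. This is a minimal sketch under stated assumptions: the function name `select_low_perplexity`, the threshold of 10.0, and the perplexity values attached to each segment are hypothetical and are not taken from either reference.

```python
def select_low_perplexity(scored_segments, threshold):
    """Rank (segment, perplexity) pairs in ascending order and keep those below threshold."""
    ranked = sorted(scored_segments, key=lambda pair: pair[1])
    return [segment for segment, ppl in ranked if ppl < threshold]

# Hypothetical segments with perplexity values assumed to be already computed.
scored = [
    ("stock prices fell", 17.0),
    ("show me the next message", 7.4),
    ("delete this e-mail", 9.1),
]

# The tuning set is updated with the lowest-perplexity segments.
tuning_set = ["show me the next e-mail"]
tuning_set.extend(select_low_perplexity(scored, threshold=10.0))
```

Segments at or above the assumed threshold are left out of the updated tuning set.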

As per claims 17 and 25, Ramaswamy et al. discloses all of the limitations of
claim 1, upon which claim 17 depends. Ramaswamy et al. further discloses:

utilizing the refined language model in an application (C.5. lines 40-42, the
application is speech recognition) to predict a likelihood of another corpus (C.5. lines
42-45-the likelihood is interpreted as the "accuracy... for the current language model"-the
other corpus is the test corpus).

As per claim 26, Ramaswamy et al. discloses all of the limitations of claim 25,
upon which claim 26 depends. Ramaswamy et al. further discloses: 

the application is one or more of a spelling and/or grammar checker, a word- 
processor, a speech recognition application, a language translation application, and the 
like (C. 5. lines 40-42, the application is speech recognition). 

As per claim 29, Ramaswamy et al. discloses all of the limitations of claim 28, 
upon which claim 29 depends. Ramaswamy et al. further discloses: 

the tuning set is dynamically selected as relevant to the received corpus 
(C.3.lines 47-54). 

As per claim 30, Ramaswamy et al. discloses all of the limitations of claim 28, 
upon which claim 30 depends. Ramaswamy et al. further discloses: 

a dynamic lexicon generation function, to develop an initial lexicon from the 
tuning set (C. 3. lines 42-44-the tuning set (seed corpus) is used to develop an initial 
lexicon (corpus)), and to update the lexicon with the select segments from the received 
corpus (C. 3. lines 50-55- "...adding linguistic units to relevant corpus"-the relevant 
corpus being the updated lexicon). 

As per claim 32, Ramaswamy et al. discloses all of the limitations of claim 28,
upon which claim 32 depends. Ramaswamy et al. further discloses:

a dynamic segmentation function (C.5. lines 1-3), to iteratively segment the
received corpus (C.5. lines 1-3) to improve a predictive performance attribute of the
modeling agent (C.5. lines 6-9-"to improve language model quality..." comprising
evaluating perplexity change which is interpreted as the predictive performance).

As per claim 33, Ramaswamy et al. discloses all of the limitations of claim 32,
upon which claim 33 depends. Ramaswamy et al. further discloses:

the dynamic segmentation function iteratively re-segments the received corpus
until the language model reaches an acceptable threshold (C.5. lines 1, 2, 9-15-the
external corpus is segmented iteratively, by extracting linguistic units, until the language
model is updated once "...a certain number..." (a threshold) is reached).

Claim Rejections - 35 USC § 103 

3. The following is a quotation of 35 U.S.C. 103(a) which forms the basis for all 
obviousness rejections set forth in this Office action: 

(a) A patent may not be obtained though the invention is not identically disclosed or described as set 
forth in section 102 of this title, if the differences between the subject matter sought to be patented and 
the prior art are such that the subject matter as a whole would have been obvious at the time the 
invention was made to a person having ordinary skill in the art to which said subject matter pertains. 
Patentability shall not be negatived by the manner in which the invention was made. 

4. Claims 7-13, 22-24, 31, 34 and 35 are rejected under 35 U.S.C. 103(a) as being
unpatentable over Ramaswamy et al. in view of Bangalore et al. (U.S. Patent No.
6,317,707 filed Dec. 7, 1998).

Ramaswamy et al. and Bangalore et al. are analogous art in that they both deal 
with language modeling. 

As per claims 7 and 24, Ramaswamy et al. discloses all of the limitations of
claim 1, upon which claim 7 depends. Ramaswamy et al. further discloses:

clustering every N-items of the received corpus into a training unit, wherein
resultant training units are separated by gaps (C.6. line 67, C.7. lines 1, 2-the separate
classes inherently include gaps);

Ramaswamy et al. does not disclose: 

calculating the similarity within a sequence of training chunks on either side of
each of the gaps; and 

select segment boundaries that maximize intra-segment similarity and inter- 
segment disparity. 

However, as it is well known in the art, Bangalore et al. teaches calculating the 
similarity within a sequence of training chunks (C.3. lines 15-18, 22, 23-the calculated 
radius determines the similarity) and selecting segment boundaries that maximize intra- 
segment similarity and inter-segment disparity (C. 3. lines 15, 16-the radius indicates the 
selected boundaries and compactness maximizes segment similarity and inter-segment 
disparity). Therefore, it would have been obvious at the time of the invention to combine 
Ramaswamy et al. with Bangalore et al. The motivation for doing so would have been to 
incorporate a well known clustering method of training data/chunks to group similar 
items and diverge dissimilar items. 
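For illustration only, the claim 7 limitations (clustering every N items into training units, scoring similarity on either side of each gap, and selecting boundaries) can be sketched as follows. This is a minimal sketch under stated assumptions: it substitutes cosine similarity over word counts for the radius and compactness measures cited from Bangalore et al., and the function names, the unit size of 2, and the cut-off of 0.5 are hypothetical.

```python
import math
from collections import Counter

def make_units(tokens, n):
    """Cluster every n items into a training unit; gaps fall between consecutive units."""
    return [tokens[i:i + n] for i in range(0, len(tokens), n)]

def cosine(left, right):
    """Cosine similarity between the word-count vectors of two units."""
    cl, cr = Counter(left), Counter(right)
    dot = sum(cl[word] * cr[word] for word in cl)
    norm = math.sqrt(sum(v * v for v in cl.values())) * math.sqrt(sum(v * v for v in cr.values()))
    return dot / norm if norm else 0.0

def gap_cohesion(units):
    """Similarity of the units on either side of each gap (gap i lies after unit i)."""
    return [cosine(units[i], units[i + 1]) for i in range(len(units) - 1)]

def pick_boundaries(cohesion, max_cohesion):
    """Place a boundary at each gap whose cohesion falls below the cut-off."""
    return [i for i, score in enumerate(cohesion) if score < max_cohesion]

# Hypothetical corpus with one topic shift in the middle.
tokens = "mail inbox mail inbox mail inbox stock price stock price stock price".split()
units = make_units(tokens, 2)
cohesion = gap_cohesion(units)
boundaries = pick_boundaries(cohesion, max_cohesion=0.5)
```

Cutting at low-cohesion gaps keeps similar units inside one segment and separates dissimilar neighbors, which is one simple way to approximate high intra-segment similarity and high inter-segment disparity.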

As per claim 8, Ramaswamy et al. and Bangalore et al. disclose all of the 
limitations of claim 7, upon which claim 8 depends. Ramaswamy et al. further discloses: 

the resultant segment defines a training chunk (C. 7. lines 14-18-the word class is 
the chunk that is then used in subsequent processing steps). 

As per claim 9, Ramaswamy et al. and Bangalore et al. disclose all of the 
limitations of claim 7, upon which claim 9 depends. Ramaswamy et al. does not 
disclose: 

N is an empirically derived value based, at least in part, on the size of the 
received corpus. 

However, as it is well known in the art, Bangalore et al. teaches having an
empirically derived N-vector for each item in the corpus, which thereby is based upon
the size of the corpus (C.2. lines 59-65) and every item is included in the vector space
(C.3. lines 7, 8, Fig. 2). Therefore, at the time of the invention, it would have been obvious
to combine Ramaswamy et al. with Bangalore et al. The motivation for doing so would 
have been to include every item in the clustering process to better improve subsequent 
language modeling results. 

As per claim 10, Ramaswamy et al. and Bangalore et al. disclose all of the 
limitations of claim 7, upon which claim 10 depends. Ramaswamy et al. does not 
disclose: 

the calculation of the similarity within a sequence of training units defines a 
cohesion score. 

However, as it is well known in the art, Bangalore et al. teaches the calculation of 
the similarity within a sequence of training units (C.3. lines 22, 23) defines a cohesion 
score (C. 3. lines 15-19 "very close relationship.." is interpreted as the cohesion). 
Therefore, at the time of the invention, it would have been obvious to combine 
Ramaswamy et al. with Bangalore et al. The motivation for doing so would have been to 
determine how close or similar the training units were to each other for the benefit of 
maximizing the clustering process of related items, to better improve subsequent 
language modeling results. 

As per claim 11, Ramaswamy et al. and Bangalore et al. disclose all of the
limitations of claim 10, upon which claim 11 depends. Ramaswamy et al. does not
disclose: 

intra-segment similarity is measured by the cohesion score. 

However, as it is well known in the art, Bangalore et al. teaches intra-segment 
similarity is measured by the cohesion score (C.3.lines 15-19, 22, 23). Therefore, at the 
time of the invention, it would have been obvious to combine Ramaswamy et al. with 
Bangalore et al. The motivation for doing so would have been to measure how close or 
similar the intra-segment training units were to each other for the benefit of maximizing 
the clustering process of related items, to better improve subsequent language 
modeling results. 

As per claim 12, Ramaswamy et al. and Bangalore et al. disclose all of the 
limitations of claim 7, upon which claim 12 depends. Ramaswamy et al. does not 
disclose: 

inter-segment disparity is approximated from the cohesion score. 

However, as it is well known in the art, Bangalore et al. teaches inter-segment
disparity (C.3. lines 24, 25-the different vector coordinates are interpreted as inter-segments) is
approximated from the cohesion score (C.4. lines 35-45, Table 2-the "Compactness
Value" determines the score and cohesion and the "Class Index" determines the inter-
segment disparity resulting from the cohesion score). Therefore, at the time of the
invention, it would have been obvious to combine Ramaswamy et al. with Bangalore et
al. The motivation for doing so would have been to determine how disparate or distinct
the inter-segment training units were to each other for the benefit of maximizing the
clustering process of related items, to better improve subsequent language modeling 
results. 

As per claim 13, Ramaswamy et al. and Bangalore et al. disclose all of the 
limitations of claim 7, upon which claim 13 depends. Ramaswamy et al. does not 
disclose: 

the calculation of inter-segment disparity defines a depth score. 

However, as it is well known in the art, Bangalore et al. teaches the calculation of
inter-segment disparity defines a depth score (C.4. lines 12-16, 30-66-Table 2 the depth
of the inter-segment disparity approximated from the cohesion score-compactness
value- is indicated as the values are "deeper" as they are farther down the list).
Therefore, at the time of the invention, it would have been obvious to combine
Ramaswamy et al. with Bangalore et al. The motivation for doing so would have been to 
determine the depth of the disparity in a ranked manner to visually determine the 
relatedness of different classes or inter-segment disparity. 

As per claim 22, Ramaswamy et al. discloses all of the limitations of claim 20, 
upon which claim 22 depends. Ramaswamy et al. does not disclose: 

the language model agent ranks the segments of the training set based, at least 
in part, on a measure of similarity between two or more segments. 

However, as it is well known in the art, Bangalore et al. teaches ranking the 
segments of a training set based on a measure of similarity (C.4. lines 9-16, 
compactness value between segments) between segments. Therefore, at the time of 




Application/Control Number: 09/607,786 Page 11

Art Unit: 2654 

the invention, it would have been obvious to combine Ramaswamy et al. with Bangalore 
et al. The motivation for doing so would have been to identify by a ranking system the 
segments of varied similarity measurements in order to maximize the clustering process 
to further improve any successive language modeling resulting from using the provided 
clustering data. 

As per claim 23, Ramaswamy et al. and Bangalore et al. disclose all of the 
limitations of claim 22, upon which claim 23 depends. Ramaswamy et al. does not 
disclose: 

the similarity measure is calculated for adjacent segments. 

However, as it is well known in the art, Bangalore et al. teaches having a 
similarity measure calculated for adjacent segments (C.2. lines 29-31, C.2. lines 59-65,
C.3. line 1, C.3. lines 15-17). Therefore, at the time of the invention, it would have been
obvious to combine Ramaswamy et al. with Bangalore et al. The motivation for doing so 
would have been to obtain similarity measurements of adjacent segments in order to 
maximize the clustering process to further improve any successive language modeling 
resulting from using the provided clustering data. 

As per claim 31, Ramaswamy et al. discloses all of the limitations of claim 28, 
upon which claim 31 depends. Ramaswamy et al. does not disclose: 

a frequency analysis function, to determine a frequency of occurrence of 
segments within the received corpus. 

However, as it is well known in the art, Bangalore et al. teaches having a function 
based upon frequencies for each input word, which determines the frequencies of 
segments within the received corpus (C.2. lines 59, 60). Therefore, at the time of the
invention, it would have been obvious to combine Ramaswamy et al. with Bangalore et 
al. The motivation for doing so would have been to assist in building a cluster in the well 
known method of having a vector space to hold the clusters with the frequency of each 
segment being incorporated into the cluster for the benefit of maximizing the clustering 
segments, to better improve subsequent language modeling results. 

As per claim 34, Ramaswamy et al. discloses all of the limitations of claim 32, 
upon which claim 34 depends. Ramaswamy et al. does not disclose: 

a frequency analysis function, to determine a frequency of occurrence of 
segments within the received corpus. 

However, as it is well known in the art, Bangalore et al. teaches having a function 
based upon frequencies for each input word, which determines the frequencies of 
segments within the received corpus (C.2.lines 59, 60). Therefore, at the time of the 
invention, it would have been obvious to combine Ramaswamy et al. with Bangalore et 
al. The motivation for doing so would have been to assist in dynamically building a 
cluster in the well known method of having a vector space to hold the clusters with the 
frequency of each segment being incorporated into the cluster for the benefit of 
maximizing the clustering segments, to better improve subsequent language modeling 
results. 

As per claim 35, Ramaswamy et al. and Bangalore et al. disclose all of the 
limitations of claim 34, upon which claim 35 depends. Ramaswamy et al. further 
discloses: 

the data structure generator removes segments from the data structure that do
not meet a minimum frequency threshold (C.4. lines 29-31-it is well known that the
relevancy of the segments is based in part on frequency), and dynamically re-segments
the received corpus to improve predictive capability while reducing the size of the data
structure (C.5. lines 1-3, C.5. lines 6-9-"to improve language model quality..." comprising
evaluating perplexity change which is interpreted as the predictive performance). 
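For illustration only, the frequency analysis and minimum-frequency pruning discussed for claims 31, 34 and 35 can be sketched as follows. This is a minimal sketch under stated assumptions: the function name `prune_by_frequency`, the dictionary used as the data structure, the minimum count of 2, and the sample segments are hypothetical and are not drawn from either reference.

```python
from collections import Counter

def prune_by_frequency(segments, min_count):
    """Count how often each segment occurs and drop those below min_count."""
    counts = Counter(segments)
    return {segment: count for segment, count in counts.items() if count >= min_count}

# Hypothetical segments observed in a received corpus.
observed = [
    "next e-mail", "next e-mail",
    "show me", "show me", "show me",
    "stock prices",
]
kept = prune_by_frequency(observed, min_count=2)
```

Removing the rarely seen segment shrinks the data structure while retaining the segments that occur often enough to be useful to the model.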

Conclusion 



5. The prior art made of record and not relied upon is considered pertinent to
applicant's disclosure.

Wong (U.S. Patent No. 5,905,773) teaches application specific dynamic
language modeling and reducing perplexity of a corpus.

Ushioda (U.S. Patent No. 5,835,893 Nov. 10, 1998) teaches using
clustering segments from a textual corpus based on frequencies and
maximizing intra-segment similarity.

Tillmann et al. (U.S. Patent No. 6,182,026 filed Jun. 26, 1998) teaches
dynamic language modeling and adjusting a lexicon to determine
maximum approximation for segment alignment.

Strong (U.S. Patent No. 5,613,036 Mar. 18, 1997) teaches using a
dynamic language modeling system.

Mahajan et al. (U.S. Patent No. 6,418,431 filed Mar. 30, 1998) teaches
using a perplexity value to determine the relevancy of a language model to
a segment of information.

Gorin et al. (U.S. Patent No. 6,044,337 filed Oct. 29, 1997) teaches 
calculating a perplexity value and elimination of candidate segments that 
are cut off by a threshold. 

Kanevsky et al. (U.S. Patent No. 6,484,136 filed Oct. 21, 1999) teaches 
utilizing an updated language model to predict a probability of another 
corpus. 

Wyard et al. (U.S. Patent No. 6,167,398 filed May 13, 1998) teaches using 
language analysis methods (including n-grams) to determine similarity and 
dissimilarity between segments of information. 

Bahl et al. (U.S. Patent No. 5,195,167 filed Mar. 16, 1993) teaches ranking 
and grouping items based on similarity using statistics. 



6. Any inquiry concerning this communication or earlier communications from the 
examiner should be directed to Lamont M Spooner whose telephone number is
703/305-8661. The examiner can normally be reached on 8:00 AM - 5:00 PM.

If attempts to reach the examiner by telephone are unsuccessful, the examiner's
supervisor, Talivaldis Smits can be reached on 703/306-3011. The fax phone number
for the organization where this application or proceeding is assigned is 703-872-9306. 

Information regarding the status of an application may be obtained from the
Patent Application Information Retrieval (PAIR) system. Status information for 
published applications may be obtained from either Private PAIR or Public PAIR. 
Status information for unpublished applications is available through Private PAIR only. 
For more information about the PAIR system, see http://pair-direct.uspto.gov. Should 
you have questions on access to the Private PAIR system, contact the Electronic 
Business Center (EBC) at 866-217-9197 (toll-free). 



lms

03/18/04 